Lower-bound Estimate for Cost-sensitive Decision Trees

Author

  • Mikhail V. Goubko
Abstract

While an extensive body of literature investigates the problem of growing decision trees, only a few works study lower-bound estimates for the expected classification cost of decision trees, especially when test costs vary. This paper proposes a new lower-bound estimate. Computing the estimate reduces to solving a series of set-covering problems. The computational complexity and other properties of the estimate are investigated. A top-down tree-construction algorithm based on the proposed estimate is tested against several popular greedy cost-sensitive heuristics on a range of standard data sets from the UCI Machine Learning Repository.

Similar resources

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to improve the accuracy of credit card fraud detection. After selecting the best and most effec...

An Upper Bound on the First Zagreb Index in Trees

In this paper we give sharp upper bounds on the Zagreb indices and characterize all trees achieving equality in these bounds. We also give a lower bound on the first Zagreb coindex of trees.

Solving the Paradox of Multiple IRR's in Engineering Economic Problems by Choosing an Optimal -cut

Until now, single values of IRR have traditionally been used to estimate the time value of cash flows. Since uncertainty exists in estimating cost data, the resulting decision may not be reliable. One of the most commonly cited drawbacks of using the internal rate of return in the evaluation of deterministic cash flow streams is the possibility of multiple conflicting internal rates of return. In this paper we p...

A Competition Strategy to Cost-Sensitive Decision Trees

Learning from data with test costs and misclassification costs has been a hot topic in data mining. Many algorithms have been proposed to induce decision trees for this purpose. This paper studies a number of such algorithms and presents a competition strategy to obtain trees with lower cost. First, we generate a population of decision trees using λ-ID3 and EG2 algorithms through considering info...

Cost-Sensitive Decision Trees with Pre-pruning

This paper explores two simple and efficient pre-pruning strategies that help the cost-sensitive decision tree algorithm avoid overfitting. One is to limit the cost-sensitive decision trees to a depth of two. The other is to prune the trees with a pre-specified threshold. Empirical study shows that, compared to the error-based tree algorithm C4.5 and several other cost-sensitive tree algorithms, t...

Publication date: 2011